OFFICE OF NAVAL RESEARCH
FINAL  REPORT 

•  Contract/Grant  Number:  N00014-91-J-1850 

•  Contract/Grant  Title:  Application  of  Nonlinear  Signal  Processing 
Techniques  to  Chemical  and  Transport  Processes 

•  Principal  Investigators:  J.  L.  Hudson  and  I.  G.  Kevrekidis 

•  Mailing  Address:  Chemical  Engineering,  Thornton  Hall,  University 
of  Virginia,  Charlottesville,  Va.  22903-2442 

and  Chemical  Engineering,  Princeton  University,  Princeton,  N.  J. 

•  Phone  Numbers  (with  Area  Code):  804-924-6275  (Virginia) 
609-258-4581  (Princeton) 

•  E-Mail  Addresses:  hudson@virginia.edu, 
yannis@arnold.princeton.edu 

1.1  RESEARCH  ACCOMPLISHMENTS: 

During the course of the grant we have carried out, as outlined in the original
proposal, computational and experimental studies of nonlinear reaction and
transport processes, collected time series and image series from these
processes, and developed, implemented, tested and applied algorithms
for nonlinear signal processing and system identification on these time series.
We also

• developed (in collaboration with the T-13 group at Los Alamos National
Laboratory) an integrated computational environment for the
generation of such nonlinear models from time series, based on artificial
neural networks;

• studied both computationally and experimentally the dynamics of adaptive
control algorithms;

•  collaborated  with  the  groups  of  Dr.  Carlos  Garcia  at  Shell  and  of  Dr. 
Barry  Tarmy  at  Exxon; 

• motivated by the model reduction and image processing work,
constructed a new class of microstructured and composite catalytic
materials.

Detailed research accomplishments (both planned in the original
proposal and new directions that arose as parts of the research over the years)
are  described  below,  with  references  to  the  attached  publication  list.  We  will 
include  a  short  description  of  a  number  of  ongoing  projects  which  have  been 
motivated  by  this  work,  and  which,  when  completed,  will  still  acknowledge 
this  grant. 

1.1.1 Time Series Collection and Analysis

The bulk of the research consisted of the application of artificial neural
network (ANN) based techniques to the processing of experimental (and
occasionally computational) time series. The experimental time series for metal
electrodissolution were obtained at our laboratory at the University of
Virginia (Cu electrodissolution in phosphoric acid, metastable pitting of Al and
Al-Cu alloys in halide solutions); experimental time series from thermal
convection were obtained in collaboration with Dr. R. E. Ecke, of the MST-10
division at Los Alamos National Laboratory.

Our original collaboration, on which the proposal was based, appeared
in Chem. Eng. Science in 1990; from the ANN point of view, its originality
consisted of the incorporation of an additional input neuron to model
the effect of an operating parameter on the system dynamics. This allowed
the modeling of an oscillatory and a period-doubling instability for Cu
electrodissolution in phosphoric acid solutions, and we are happy to report that
the paper has been reprinted in a book entitled “Artificial Neural Networks,
Forecasting Time Series,” V. Rao Vemuri and Robert D. Rogers, eds., IEEE
Computer Society Press, Washington (1994) (ISBN 0-8186-5120-2).

This work continued and resulted in our Chem. Eng. Communications
publication in 1992. Here, Cu electrodissolution and a sequence of
period-doublings leading to chaotic dynamics were studied in
detail. Furthermore, the paper contained two innovations in our neural
network research. The first (which was part of the title of the paper) was the
development and implementation of a new class of neural network
architectures, templated on numerical integrators, which could yield
continuous-time models as opposed to discrete-time ones. This development allowed for the
qualitatively correct interpretation of instabilities and bifurcations, which
“traditional” discrete-time ANN or other models cannot capture. This work
has received some recognition, and recently we made our codes available to
Professor T. McAvoy, of the University of Maryland. The second
development was the use of so-called “nonlinear principal components” for the
preprocessing of the time series; the results of this preprocessing are then fed
to the continuous-time ANN algorithms.
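
The idea of templating a network on a numerical integrator can be illustrated with a minimal sketch. It assumes a small one-hidden-layer tanh network as the learned vector field and a classical fourth-order Runge-Kutta step; the weights, layer sizes and function names are hypothetical and are not taken from the Chem. Eng. Communications paper. The point is only that the same network is evaluated at every stage of the integrator, so what gets identified is a continuous-time right-hand side and not a discrete-time map.

```python
import math

def make_vector_field(W1, b1, W2, b2):
    """Return f(x): a one-hidden-layer tanh network mapping state -> dx/dt."""
    def f(x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(W1, b1)]
        return [sum(w * h for w, h in zip(row, hidden)) + b
                for row, b in zip(W2, b2)]
    return f

def rk4_step(f, x, h):
    """One classical Runge-Kutta step; the same network f is reused four times."""
    k1 = f(x)
    k2 = f([xi + 0.5 * h * k for xi, k in zip(x, k1)])
    k3 = f([xi + 0.5 * h * k for xi, k in zip(x, k2)])
    k4 = f([xi + h * k for xi, k in zip(x, k3)])
    return [xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

# Hypothetical, untrained weights for a 2-state system with 3 hidden units.
W1 = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.4, -0.6, 0.3], [-0.2, 0.5, 0.9]]
b2 = [0.0, 0.0]
f = make_vector_field(W1, b1, W2, b2)
x_next = rk4_step(f, [1.0, 0.0], h=0.1)  # predicted state one time step ahead
```

In training, the weights inside `f` would be adjusted so that `rk4_step` reproduces consecutive measurements; the step size `h` then becomes an explicit, changeable quantity.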

Our 1994 Physica D paper on processing time series from the quasiperiodic
regime of Rayleigh-Bénard convection did not contain similar novelties
in neural network architectures; nevertheless, it was the first to attempt to
explain truly complex sequences of global bifurcations for Poincaré maps in a
very rich dynamical regime; this paper was initially a Los Alamos report, was
made electronically available at Los Alamos, and we received a large number
of copy requests within a few days of its appearance.

After successfully completing the analysis of the thermal convection
data, our efforts on the development of neural network algorithms turned to
the novel class of integrator-based architectures we proposed for
continuous-time system identification. While in our Chem. Eng. Communications
paper we proposed the use of iterated neural networks, based on integrators
like Runge-Kutta-type algorithms, it became clear that for “stiff” systems
one would need ANN architectures based on implicit integrators, and these
give rise to recurrent neural networks. The efficient training of such
networks (currently done by essentially “feedforward” algorithms, like the one
proposed by Pineda in 1987) is still an open problem, and we have made
significant progress in that direction, as also discussed below in the algorithm
section. We have made several presentations on this work, including a
refereed proceedings paper at the 1993 IEEE NN Conference in San Francisco,
a proceedings paper in the 1993 6th SIAM Conference on Parallel
Computing and Applications, as well as a proceedings paper in the 1993 American
Control Conference in San Francisco.
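
Why an implicit integrator leads to a recurrent network can be seen in a minimal sketch. It assumes the simplest implicit scheme, backward Euler, in which the unknown next state y appears on both sides of y = x + h f(y), so the output of the network f feeds back into its own input. The solver below uses plain fixed-point (feedback) iteration and a stand-in right-hand side; the function names are hypothetical. This simple iteration contracts only when h times the Lipschitz constant of f is below one, so genuinely stiff problems would call for a Newton-type solve instead.

```python
def implicit_euler_step(f, x, h, tol=1e-12, max_iter=200):
    """Solve y = x + h*f(y) by fixed-point (feedback) iteration.

    Returns (y, converged). The loop is the recurrent part: the output of f
    is fed back as its own input until the state stops changing.
    """
    y = list(x)  # initial guess: the current state
    for _ in range(max_iter):
        fy = f(y)
        y_new = [xi + h * fi for xi, fi in zip(x, fy)]
        if max(abs(a - b) for a, b in zip(y_new, y)) < tol:
            return y_new, True
        y = y_new
    return y, False

# Stand-in right-hand side dx/dt = -2x (a trained network would be used here).
decay = lambda y: [-2.0 * y[0]]
y, converged = implicit_euler_step(decay, [1.0], h=0.1)
```

For this linear stand-in the implicit equation can be solved by hand, y = x / (1 + 2h), which gives a quick check of the iteration.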

This work has also naturally branched towards the identification of
gray box models, based on neural network architectures; the first publication
along these lines has been a refereed proceedings paper in the 1994 IEEE
Workshop on Neural Networks for Signal Processing. We are currently
continuing this work towards parallel implementation of these algorithms using
PVM on an IBM SP2 parallel computer, a small part of which was financed
through this grant. Both of the last directions (recurrent black box ANNs
templated on implicit integrators as well as gray-box-type ANNs, also based
on implicit integrators) are still amenable to the development of new
training algorithms to exploit massively parallel architectures. While Pineda-type
algorithms “fit” SIMD machines, some of the more “exact” algorithms we
have developed and implemented in scalar architectures and environments
like PVM can be much better fitted to MIMD machines, and that is a
subject we will continue to investigate.

A  final  aspect  of  our  ANN  research  was  the  realization  (motivated 
by  our  study  of  adaptive  control  systems)  that  discrete-time  neural  network 
models based on time series can be noninvertible, i.e. a given state can have
more than one “preimage” backward in time. This is a pathology which, like
the shortcomings of discrete-time modeling discussed above, must be kept
in mind, and tested for, before one trusts the predictive power of discrete-time
ANNs.  We  have  published  so  far  a  refereed  proceedings  paper  in  the  1993 
IEEE  NN  conference  proceedings,  as  well  as  a  paper  in  the  proceedings  of  the 
28th  annual  IEEE  conference  on  Information  Systems  and  Sciences  (1994)  on 
this  matter.  We  are  currently  completing  an  archival  journal  paper  on  this 
subject,  which  we  will  submit  to  IEEE  Journal  on  Neural  Networks  during 
this  academic  year. 
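
A simple way to test a one-dimensional discrete-time model for noninvertibility is to count how many states map onto a given target. The sketch below is an illustration under stated assumptions: the logistic map stands in for a trained discrete-time ANN, and preimages are counted by looking for sign changes of g(x) - target on a uniform grid (so tangencies and roots closer together than the grid spacing can be missed). The function names are hypothetical.

```python
def count_preimages(g, target, lo, hi, n=10000):
    """Count solutions of g(x) = target on [lo, hi] via sign changes on a grid."""
    count = 0
    prev = g(lo) - target
    for i in range(1, n + 1):
        x = lo + (hi - lo) * i / n
        cur = g(x) - target
        # A strict sign change brackets a root; an exact zero at the previous
        # grid point is counted once here.
        if prev == 0.0 or prev * cur < 0.0:
            count += 1
        prev = cur
    return count

# Stand-in for a discrete-time model: the (noninvertible) logistic map.
logistic = lambda x: 3.8 * x * (1.0 - x)
two = count_preimages(logistic, 0.5, 0.0, 1.0)    # two states map to 0.5
none = count_preimages(logistic, 0.99, 0.0, 1.0)  # 0.99 is above the map's maximum
```

A count greater than one anywhere in the region of interest signals that the model cannot be run uniquely backward in time there.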

We  were  invited  to  write  a  chapter  on  these  two  issues  (discrete-  vs. 
continuous-time  ANN  models,  as  well  as  noninvertibility  in  ANNs)  in  an 
Elsevier  book  (“Neural  Networks  for  Chemical  Engineers”,  A.  Bulsari  ed.); 
we  were  just  notified  that  the  book  has  been  published. 

On  the  study  of  metastable  pitting  of  aluminum  and  aluminum  alloys 
an  intensive  experimental  and  mathematical  study  program  was  initiated  at 
Virginia.  It  is  known  that  metastable  pitting  occurs  at  potentials  well  below 
the  pitting  potential  in  many  alloy-electrolyte  systems.  An  understanding  of 
the  metastable  pitting  process  is  important  not  only  in  its  role  in  the  stable 
pitting  process  but  also  independently  in  miniaturized  components  such  as 
in VLSI circuits. We have developed an apparatus for obtaining low noise
current signals in the nanoamp range. Analysis of the signals (J. Electrochem.
Soc. (94)) shows that they can be modeled as a stochastic process with the
probability of initiation being dependent on previous events.

Professor  Hudson  was  also  invited  to  give  one  of  the  plenary  lectures 
in  the  2nd  Experimental  Chaos  Conference,  and  contributed  a  review  article 
on  “Chaos  during  heterogeneous  chemical  reactions”  in  the  Proceedings  of 
that  conference. 

The  chaotic  time  series  from  the  electrodissolution  of  copper  have  also 
been  analyzed  by  a  related  global  vector  field  method  in  cooperation  with  a 
group at the Laboratoire d’Energétique des Systèmes et Procédés at Rouen,
France.  In  this  work  (which  just  appeared  in  J.  Phys.  Chem.),  it  is  shown 
that  the  attractor  obtained  from  the  reconstructed  system  is  topologically 
equivalent  to  the  attractor  obtained  directly  from  the  experimental  data. 

1.1.2  Spatiotemporal  Dynamics 

One of the main issues of the research we originally proposed was to move
beyond scalar time series, and develop and exploit model identification techniques
for distributed systems. We had several successes in this direction, and, as
we will discuss below, one of the high points of our work (quite unforeseen
at the beginning) came from this research direction.

The first success we had came from a collaboration with the group of
Professor G. Ertl, at the Fritz Haber Institut of the Max Planck Gesellschaft
in Berlin, Germany. This is one of the leading surface science /
catalysis / electrochemistry groups in the world. In the late 80s they
developed an electron microscopy technique called PEEM, or photoemission
electron microscopy, which allows the real-time, micrometer resolution
observation of reactant adsorbate coverages on metal catalysts in reactions like
the CO oxidation on Pt, or NO and CO on Pt etc. This microscopy revealed
that catalytic rate oscillations are not spatially uniform, and that a
bewildering variety of spatiotemporal two-dimensional patterns (spirals, stripes,
targets, hexagons, chemical turbulence) occurs on the catalyst surface in
real time and on a five- to ten-micrometer scale. In collaboration with a
DFG postdoctoral Fellow in Princeton (Dr. Katharina Krischer) we
analyzed real-time spatiotemporal video PEEM data, and were able to reduce
image series (of the order of 60,000 time series) to only four relevant time
series, based on principal component analysis. Subsequently, we were able
to construct an ANN-based model that successfully predicted the
spatiotemporal behavior; this input-output model had four inputs and one time-delay,
hence a total of eight degrees of freedom - a surprising, even though not
completely unexpected, reduction since the data did exhibit spatial
coherence. The paper that resulted from this work was published in the AIChE
Journal in 1993, had excellent reviews, was the first paper in the Journal to
contain color pictures, and was excerpted in Chemical Engineering Progress
in early 1993.
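
The reduction of an image series to a few time series by principal component analysis can be sketched as follows. This is an illustration only, assuming the "method of snapshots" (eigenvectors of the small snapshot-by-snapshot Gram matrix) and plain power iteration for the leading mode; it is not the code used for the PEEM data, and the synthetic four-pixel "images" and all names are hypothetical. Each snapshot is projected onto a spatial mode, and the resulting coefficient is the reduced time series.

```python
import math

def leading_mode(snapshots, iters=500):
    """Leading principal component of mean-removed snapshots.

    snapshots: list of T images, each a flat list of N pixel values.
    Returns (mode, coeffs): a unit-norm spatial mode and its T time coefficients.
    """
    T, N = len(snapshots), len(snapshots[0])
    mean = [sum(s[j] for s in snapshots) / T for j in range(N)]
    X = [[s[j] - mean[j] for j in range(N)] for s in snapshots]
    # T x T Gram matrix: much smaller than the N x N covariance when N >> T.
    G = [[sum(a * b for a, b in zip(X[i], X[k])) for k in range(T)]
         for i in range(T)]
    v = [1.0 + 0.01 * i for i in range(T)]  # arbitrary nonzero start vector
    for _ in range(iters):  # power iteration for the dominant eigenvector of G
        w = [sum(G[i][k] * v[k] for k in range(T)) for i in range(T)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    mode = [sum(v[i] * X[i][j] for i in range(T)) for j in range(N)]
    norm = math.sqrt(sum(m * m for m in mode))
    mode = [m / norm for m in mode]
    coeffs = [sum(x * m for x, m in zip(row, mode)) for row in X]
    return mode, coeffs

# Synthetic data: one fixed spatial pattern with an oscillating amplitude.
pattern = [1.0, 2.0, 3.0, 4.0]
snapshots = [[5.0 + math.sin(0.7 * t) * p for p in pattern] for t in range(10)]
mode, coeffs = leading_mode(snapshots)
```

Further modes would be obtained by deflating the Gram matrix and repeating; the handful of coefficient series then replace the many pixel-wise series as inputs to the model.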

In our invited chapter (mentioned above, which was just published)
we were able to further reduce the degrees of freedom in the model using
the so-called “Non-Linear Principal Components”; this ended up giving us a
probably minimal three-degree-of-freedom model, which completely predicted
the spatiotemporal image series in continuous time - with just three scalar
initial conditions.

A sequel to that work was the analysis (without modeling) of
spatiotemporal data from another catalytic reaction, the NO + CO reaction,
from the group of Dr. Ronald Imbihl in Prof. Ertl’s Laboratory. This
work was done mainly by a joint Virginia-Princeton postdoctoral fellow, Dr.
Michael Graham, partially supported through this grant, who is now an
Assistant Professor of Chemical Engineering at the University of
Wisconsin-Madison. The paper from this work is finally in press in Chaos, Solitons and
Fractals - we were sent the page proofs last month.

An important experimental breakthrough at Prof. Hudson’s lab in
Virginia was the discovery of spatiotemporal oscillations in two spatial
dimensions in the electrodissolution of iron - a circular electrode, close to the
Flade potential, dissolves in a time-dependent manner; the oscillations are
associated with the formation and dissolution of a film, which clearly shows
spatiotemporal symmetry breaking. The data were collected at Virginia and
processed in Princeton, and the publication appeared in Physics Letters
A in 1993. A similar study has also been carried out on a ring geometry
(Ind. & Eng. Chem., 1995). The spatiotemporal period doubling is again
observed; this is followed by another symmetry breaking resulting in
patterns on four quadrants of the ring. This work continues in the laboratory
of Professor Hudson.

An unexpected development from our catalytic pattern formation
work was the idea and the subsequent construction and testing of
microstructured heterogeneous catalysts. The motivation was really mathematical:
simulations of pattern formation with reaction-diffusion models are done in finite
computational domains, which usually contain a few resolved “features”
(spirals, waves, fronts). The actual Berlin experiments were done on catalysts of
typical dimension 1 cm, while the typical pattern size is 5-10 microns.
Obviously, it would make sense, in order to compare experiments with theory, to
do experiments in smaller domains, say 20 to 50 microns wide. We thought
of using lithography to construct such domains on Pt catalysts, and used
titanium as the inert “building material” for “fencing in” finite catalyst
domains. This work has led to an amazing variety of spatiotemporal patterns,
and has allowed the detection of several new phenomena involving the
interactions of patterns with boundaries. We were fortunate to have the original
paper from this work published in Science in 1994, while the long version of
this work is currently in press in Physics Reports E. The work started with
Dr. Michael Graham, a joint postdoc partially supported through this grant,
and continues with Dr. Markus Baer, a DFG Fellow in Princeton. There are
several forthcoming papers motivated by and continuing this work (like a
recently submitted Phys. Rev. Letter), and there is a novel direction for it
described in the “other initiatives” section below.


1.1.3  Software  Development 

Parallel  Algorithms  for  Neural  Network  Training 

At the T-13 group of Los Alamos National Laboratory the two investigators
involved in this research project (Dr. Alan Lapedes and Rob Farber) have
been developing, since 1986, a Neural Network Compiler system. This
effort formed one of the bases of the research proposed and performed under
this grant. This Compiler system allows us to specify an arbitrary neural
network architecture and an arbitrary training set (in the appropriate
format), and constructs an efficient neural network simulation for the specified
destination computer. The target architecture for the production runs for
nonlinear system identification based on time- and image-series processing
for this research was initially the Thinking Machines CM-2 and CM-200 at
Los Alamos.

As part of this grant, in addition to the production runs for the various
physicochemical and engineering systems we studied (which have been
reported in the publications that resulted from this work), the Los Alamos
team was able to make two critical additions to the compiler system. These
changes were crucial in enabling us to study some of our test cases. The
first was the incorporation of the ability to simulate recurrent neural networks
(networks with both feedforward and feedback connections). The second was
the ability to generate efficient simulations for the Thinking Machines
CM-5 line of parallel supercomputers.

Adding these two new paradigms, recurrent neural networks and
support for MIMD (Multiple Instruction Multiple Data) computers,
required significant rethinking and restructuring of the internals of the
compiler. Since the compiler attempts to efficiently exploit every feature of the
destination architecture, almost every facet of our system, from the model
generation, dependency analysis, code optimization, loader/linker and
pseudo-code emulation to destination machine code generation, had to be re-thought
and modified.

The efficient simulation of recurrent neural networks is still an open
question in the neural network literature. There is no definitive method for
the training and evaluation of recurrent neural networks without substantial
computational effort. Neural networks are “trained” by adjusting parameters
within the network architecture so that the network can reconstruct the
desired output vectors of the training set given the input vectors with a minimal
error. This is generally accomplished by iteratively evaluating and changing
the parameters of the neural network according to some optimization
procedure (such as Conjugate Gradients). However, the addition of recurrent
connections to the neural network architecture requires that for each input
vector the network must somehow reach a fixed point before the match with
the target output vector can be computed. This adds serious complications,
as the iterative procedure necessary for the simple evaluation of the network
output may not converge to a fixed point but may instead oscillate or, worse,
diverge. We used the method of Pineda for the implementation of recurrent
network architectures in our compiler. An important characteristic of this
algorithm is that the network must be evaluated in a feedforward manner
a number of times for each input vector. We discovered that our accuracy
requirements dictated a high number of feedforward iterations for each
input vector. This could in general increase the runtime of our simulations
by up to two orders of magnitude. Our compiler modifications, as well as
significant amounts of computer time, were therefore necessary to test and
use these algorithms.
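
The evaluation difficulty described above can be made concrete with a small sketch of the forward relaxation only (it does not implement Pineda's training rule). It assumes a fully connected tanh network with hypothetical weights `W` and input `u`, repeats the feedforward sweep s <- tanh(W s + u), and reports whether a fixed point was reached. With bounded tanh units the state cannot blow up, so the failure mode shown here is a persistent oscillation; with unbounded units divergence is also possible.

```python
import math

def relax(W, u, max_sweeps=1000, tol=1e-10):
    """Iterate s <- tanh(W s + u) from s = 0; return (s, sweeps, converged)."""
    n = len(u)
    s = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        s_new = [math.tanh(sum(W[i][j] * s[j] for j in range(n)) + u[i])
                 for i in range(n)]
        change = max(abs(a - b) for a, b in zip(s_new, s))
        s = s_new
        if change < tol:
            return s, sweep, True
    return s, max_sweeps, False

# Weak feedback: the sweeps contract and settle on a fixed point.
s_weak, sweeps_weak, ok_weak = relax([[0.0, 0.3], [-0.3, 0.0]], [0.2, -0.1])

# Strong negative self-feedback: the state keeps flipping sign and never settles.
s_strong, sweeps_strong, ok_strong = relax([[-5.0]], [0.5], max_sweeps=200)
```

The number of sweeps needed in the convergent case is the multiplier on the cost of a single feedforward evaluation, which is where the large runtime increase comes from.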

The compiler gets its speed on a parallel computer by mapping the
computation to the hardware, so that it becomes a purely computational
problem with minimal communications overhead. During execution of a
feedforward neural network code, which is being trained according to a least
mean squares criterion on the CM-5, the only communication required is
that of a global summation across each of the output neurons. We perform
this mapping ourselves, because current parallel machines have a high cost
in communicating arbitrarily between processors. On the CM-5, the
overhead for this communication can be as high as 1000 times the cycle time
of the computer. Conversely, communicating from one processor to all the
other processors generally requires only one cycle. Our simulations exploited
both these characteristics of the communications to efficiently map the neural
network simulation to the computational hardware, so that each processor
spends its time calculating instead of waiting for data communications. The
overall compiler paradigm can be seen in the block diagram of Figure 1. The
compiler is also intelligent enough to maintain the recurrent variables within
the local processor memory so that a recurrent neural network run requires
an expensive global summation only after the network has iterated towards
a fixed point. Efficiently doing floating point calculations on the CM-5
architecture has the additional complication that all the data must be formatted so that
it can be accessed and efficiently loaded into the vector pipeline of the local
processor vector unit. A large number of calculations can be avoided if the
data to be calculated is boolean in nature (i.e. having values of only zero or
one). Depending on the network architecture, the compiler can have neurons
and connections specified as being locked to a constant value or as
“equivalenced”. This was crucial in implementing some of the “continuous-time”
identification architectures proposed by the Princeton group.
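
The communication pattern described in this paragraph (parameters sent to every processor, purely local work, then one global summation) can be sketched in a few lines. This is a serial illustration only: the "processors" are list shards, the model is a hypothetical one-parameter-pair linear unit, and the final `sum` stands in for the machine's global reduction. None of the names come from the compiler itself.

```python
def local_error(params, shard):
    """Sum of squared errors on one processor's shard; needs no communication."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in shard)

def split(data, n_procs):
    """Deal the training pairs out to n_procs hypothetical processors."""
    return [data[p::n_procs] for p in range(n_procs)]

def global_error(params, shards):
    """Broadcast params, compute locally, then combine with one reduction."""
    partial = [local_error(params, shard) for shard in shards]  # independent work
    return sum(partial)  # stands in for the single global summation

data = [(float(x), 2.0 * x + 1.0) for x in range(64)]
params = (1.5, 0.0)
serial = local_error(params, data)
parallel = global_error(params, split(data, 8))
```

Because the least-squares error is a sum over training pairs, the result does not depend on how the pairs are distributed, which is what makes this mapping communication-light.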

To facilitate interoperability between the Los Alamos and Princeton
software tools, we implemented the expression of the neural network
prediction either as a pseudo-code suitable for immediate evaluation within the
automated framework of the Los Alamos compiler tools, or as a C or
FORTRAN subroutine suitable for easy integration into a variety of software
packages including those developed and used at Princeton. Additionally, we
implemented several “pruning” heuristics within the training of the compiler
to minimize the size and connectivity of the final trained network. We noted
two important effects:

• For many problems the network can sustain the loss of many parameters
(neurons and connections) without affecting its generalization
properties; and

• The resulting prediction code could run many times faster.

Figure 1: Block Diagram of Neural Network System

Figure 2: CM-5 timings for a fixed architecture

Both  effects  were  helpful  in  our  analysis  of  the  trained  networks  since 
they  were  typically  simpler  and  real  time  evaluation  was  significantly  faster. 
Although  the  compiler  is  dependent  on  the  computing  and  communications 
environment,  we  have  been  able  to  use  our  code  at  a  number  of  different 
institutions  for  a  variety  of  collaborations  on  widely  varying  problems.  We 
are  open  to  collaborations  with  researchers  on  problems  of  interest  for  which 
our  analytic  skills  and  computational  tools  can  provide  a  means  of  solution. 

To optimize the training procedure of the computationally efficient
simulation generated by our compiler, we use several types of gradient based
optimization algorithms. These methods have been shown to be, in general,
significantly faster at finding minima than non-gradient based ones. Thus
our compiler has the ability to calculate the symbolic derivatives of the
specified neural network (recall that the network can be composed of arbitrary
“neuron” functions as well as an arbitrary energy function). As long as the
functions used are differentiable, the compiler can determine the symbolic
derivative of the network and energy function as a whole and express it in
efficiently executable code for the destination machine (via the same
algorithms which efficiently express the neural network function itself). To the
extent that nonlinear optimization is indeed a “black art”, we have
implemented within the compiler system several optimization routines so we can
search for the one best suited to solving our particular problem.
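
How a symbolic derivative of a network plus energy function can be formed mechanically is illustrated by the toy differentiator below. It is a sketch under assumptions, not the compiler's algorithm: expressions are nested tuples with a hypothetical five-operator vocabulary (`const`, `var`, `add`, `mul`, `tanh`), and no simplification or code generation is attempted.

```python
import math

def diff(e, v):
    """Symbolic derivative of expression tree e with respect to variable v."""
    op = e[0]
    if op == "const":
        return ("const", 0.0)
    if op == "var":
        return ("const", 1.0 if e[1] == v else 0.0)
    if op == "add":
        return ("add", diff(e[1], v), diff(e[2], v))
    if op == "mul":  # product rule
        return ("add", ("mul", diff(e[1], v), e[2]),
                       ("mul", e[1], diff(e[2], v)))
    if op == "tanh":  # chain rule: d tanh(u) = (1 - tanh(u)^2) du
        one_minus_sq = ("add", ("const", 1.0),
                        ("mul", ("const", -1.0), ("mul", e, e)))
        return ("mul", one_minus_sq, diff(e[1], v))
    raise ValueError("unknown operator: " + op)

def evaluate(e, env):
    """Numerically evaluate an expression tree for the variable values in env."""
    op = e[0]
    if op == "const":
        return e[1]
    if op == "var":
        return env[e[1]]
    if op == "add":
        return evaluate(e[1], env) + evaluate(e[2], env)
    if op == "mul":
        return evaluate(e[1], env) * evaluate(e[2], env)
    if op == "tanh":
        return math.tanh(evaluate(e[1], env))
    raise ValueError("unknown operator: " + op)

# Energy E = (tanh(w*x + b) - y)^2 for one training pair.
net = ("tanh", ("add", ("mul", ("var", "w"), ("var", "x")), ("var", "b")))
err = ("add", net, ("mul", ("const", -1.0), ("var", "y")))
energy = ("mul", err, err)
dE_dw = diff(energy, "w")  # a new expression tree, usable by any optimizer
```

Any differentiable neuron or energy function can be supported by adding one rule to `diff` and one to `evaluate`, which mirrors the "arbitrary functions" requirement in the text.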

Empirical results indicate that our method of mapping and utilizing
the CM-5 architecture is indeed effective. Thinking Machines has
acknowledged our implementation of neural networks on the CM-5 as the “most
efficient method possible” (Parallel Computing, 14:305-315, 1990). We can
see in Figure 2 that for a fixed neural network architecture, the runtime of
the forward pass scales well with the number of processors. This of course
assumes that there is enough data within the training set to keep the vector
pipelines within the CM-5 processors fully loaded. The data for the plot in
Figure 2 were determined by taking the best time of 50 network evaluations on
the 32, 64, and 128 node partitions of the Los Alamos CM-5. Each partition
was running in time sharing mode. The neural network architecture was kept
fixed for all the runs. We were unable to run on the 1024 node partition of
the Los Alamos CM-5 to provide timings on larger partitions. We have
timings for the Naval Research Laboratory 1024 node CM-5 (the use of which
was graciously granted to us by ONR through this grant) but we have not
included them here as the NRL machine uses different speed processors and
hence would not provide an equivalent hardware platform.
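
The "best time of 50 evaluations" protocol can be written down as a short helper. This is a generic sketch using Python's standard timer, not the instrumentation used on the CM-5; the workload in the example is a hypothetical stand-in for a network evaluation. Taking the minimum is a common choice on a time-shared machine because interference from other jobs can only lengthen a run.

```python
import time

def best_of(fn, repeats=50):
    """Return the minimum wall-clock time, in seconds, of `repeats` calls to fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

# Stand-in workload; a forward pass over the training set would go here.
elapsed = best_of(lambda: sum(i * i for i in range(1000)), repeats=50)
```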

Simulation and Stability/Bifurcation Software

The output of the above training procedure is, in subroutine form, a
discrete-time neural network state-space model. This subroutine (along with an
additional subroutine that calculates first derivatives of the model with respect to
its variables and parameters) is then input in a general purpose
simulation / stability analysis / visualization package that we have been developing over
many years in Princeton.

The package is called SCIGMA, and it can be used to study
discrete dynamical systems (like the maps produced by traditional
discrete-time ANNs) as well as continuous dynamical systems (ODEs, produced by
our novel continuous-time ANNs) and also discretized PDEs.

We have made this package available via FTP from Princeton to many
researchers in the US and abroad, and we also have used it in teaching
graduate and undergraduate classes in Princeton (such as Differential Equations
and Introduction to Nonlinear Dynamics).

The package can interactively

• simulate maps, ODEs and discretized PDEs (currently the
maximum system dimension is around 100 for real-time purposes);

• perform real-time three-dimensional interactive visualization of the
dynamics on selected projections of the phase space;

• allow the additional real-time visualization of spatiotemporal patterns
by reconstructing 1-D (and even 2-D) PDE solutions in physical space
and playing them as movies in time;

• locate and analyze the stability of fixed and periodic points for maps;

• locate and analyze the stability of steady states of ODEs and
(discretized) steady states of PDEs;

• locate and analyze the stability of limit cycles of ODEs and PDEs,
finding them and their eigenvalues (Floquet multipliers) as fixed points
of interactively determined Poincaré maps;

•  approximate  1-dimensional  invariant  manifolds  for  fixed  or  periodic 
points  of  maps; 

•  approximate  1-dimensional  invariant  manifolds  for  steady  states  of  ODEs; 

•  approximate  two-dimensional  invariant  manifolds  for  steady  states  of 
ODEs; 

•  approximate  two-dimensional  invariant  manifolds  for  limit  cycles  of 
ODEs; 

•  record  and  replay  a  session,  automatically  scanning  a  parameter  range; 

•  generate  video  movies  of  a  session  with  annotations; 

• handle discrete neural networks that are noninvertible, for which we had
to add further capabilities, like computing the (several possible)
inverse maps and keeping track of the exploding number of successive
preimages backward in time, as well as the calculation of the so-called
critical curves; this is an experimental version of the code, and this
part has not been made publicly available.
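
One of the capabilities listed above, locating a fixed point of a map and analyzing its stability, can be illustrated with a minimal sketch. It is not SCIGMA code: the Hénon map stands in for a trained discrete-time model, the Jacobian is supplied analytically (the role played by the derivative subroutine mentioned earlier), and all function names are hypothetical. Newton's method is applied to g(z) - z = 0, and stability is read off the eigenvalues (multipliers) of the Jacobian at the fixed point.

```python
import cmath

def henon(z, a=1.4, b=0.3):
    """Stand-in planar map: (x, y) -> (1 - a x^2 + y, b x)."""
    x, y = z
    return (1.0 - a * x * x + y, b * x)

def henon_jac(z, a=1.4, b=0.3):
    """Analytic Jacobian of the Henon map."""
    x, _ = z
    return ((-2.0 * a * x, 1.0), (b, 0.0))

def newton_fixed_point(g, jac, z, tol=1e-13, max_iter=50):
    """Newton iteration on g(z) - z = 0 for a planar map g."""
    for _ in range(max_iter):
        gx, gy = g(z)
        r0, r1 = gx - z[0], gy - z[1]  # residual of the fixed-point equation
        (j00, j01), (j10, j11) = jac(z)
        a00, a01, a10, a11 = j00 - 1.0, j01, j10, j11 - 1.0  # J - I
        det = a00 * a11 - a01 * a10
        dx = (-r0 * a11 + r1 * a01) / det  # solve (J - I) d = -r by Cramer's rule
        dy = (a10 * r0 - a00 * r1) / det
        z = (z[0] + dx, z[1] + dy)
        if abs(dx) + abs(dy) < tol:
            return z
    raise RuntimeError("Newton iteration did not converge")

def multipliers(jac, z):
    """Eigenvalues of the 2x2 Jacobian at z."""
    (j00, j01), (j10, j11) = jac(z)
    tr, det = j00 + j11, j00 * j11 - j01 * j10
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + disc) / 2.0, (tr - disc) / 2.0)

z_star = newton_fixed_point(henon, henon_jac, (0.5, 0.2))
lams = multipliers(henon_jac, z_star)
stable = all(abs(lam) < 1.0 for lam in lams)  # stable iff all multipliers lie inside the unit circle
```

For these parameter values one multiplier lies inside and one outside the unit circle, so the fixed point is a saddle; periodic points are handled the same way by applying the procedure to an iterate of the map.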

In  addition,  we  have  built  an  interactive  interface,  running  on  SGI
machines  using  the  GL  library,  for  the  general-purpose  bifurcation  package
AUTO,  with  the  approval  of  the  package’s  author,  Professor  E.  Doedel  of
Concordia  University  in  Montreal,  Canada.

Our  “Interactive  AUTO”  allows  the  real-time  visualization  of  continuation
and  stability  calculations  for

•  one-parameter  continuation  of  steady  states  of  ODEs; 

•  one-parameter  continuation  of  limit  cycles  for  ODEs; 

•  one-parameter  continuation  of  fixed  and  periodic  points  for  maps; 

•  detection  and  two-parameter  continuation  of  turning  points;

•  detection  and  two-parameter  continuation  of  Hopf  points;

•  detection  and  two-parameter  continuation  of  period-doubling  points,
etc.
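To make the idea of one-parameter continuation with turning-point detection concrete, here is a minimal pseudo-arclength sketch for a scalar steady-state problem. It is not AUTO's implementation: the test equation dx/dt = lam - x**2, the step size and all function names are assumptions chosen for illustration. A turning point is flagged where d(lam)/ds changes sign along the branch, and stability is read off from the sign of the derivative of the right-hand side.

```python
import math

def f(x, lam):
    """Scalar test problem dx/dt = lam - x**2 (an assumed example with one fold)."""
    return lam - x * x

def f_x(x, lam):
    return -2.0 * x

def f_lam(x, lam):
    return 1.0

def tangent(x, lam, previous):
    """Unit tangent to the curve f(x, lam) = 0, oriented along `previous`."""
    tx, tl = -f_lam(x, lam), f_x(x, lam)  # satisfies f_x*tx + f_lam*tl = 0
    norm = math.hypot(tx, tl)
    tx, tl = tx / norm, tl / norm
    if tx * previous[0] + tl * previous[1] < 0.0:
        tx, tl = -tx, -tl
    return tx, tl

def corrector(xp, lp, t, tol=1e-12, max_iter=20):
    """Newton iteration for f = 0 on the line through the predictor normal to t."""
    x, lam = xp, lp
    for _ in range(max_iter):
        r1 = f(x, lam)
        r2 = (x - xp) * t[0] + (lam - lp) * t[1]
        if abs(r1) < tol and abs(r2) < tol:
            return x, lam
        a, b = f_x(x, lam), f_lam(x, lam)
        det = a * t[1] - b * t[0]          # determinant of the bordered 2x2 system
        x += (-r1 * t[1] + b * r2) / det   # Cramer's rule for the Newton update
        lam += (-a * r2 + t[0] * r1) / det
    raise RuntimeError("corrector did not converge")

def continue_branch(x0, lam0, ds=0.1, steps=40):
    """Trace the steady-state branch; return (branch, folds)."""
    x, lam = x0, lam0
    t = tangent(x, lam, (0.0, -1.0))        # start towards decreasing lam
    branch = [(x, lam, f_x(x, lam) < 0.0)]  # steady state is stable if f_x < 0
    folds = []
    for _ in range(steps):
        x_new, lam_new = corrector(x + ds * t[0], lam + ds * t[1], t)
        t_new = tangent(x_new, lam_new, t)
        if t[1] * t_new[1] < 0.0:           # d(lam)/ds changes sign: turning point
            s = t[1] / (t[1] - t_new[1])    # linear interpolation between the steps
            folds.append((x + s * (x_new - x), lam + s * (lam_new - lam)))
        x, lam, t = x_new, lam_new, t_new
        branch.append((x, lam, f_x(x, lam) < 0.0))
    return branch, folds

branch, folds = continue_branch(1.0, 1.0)
```

Because the arclength, rather than the parameter, is used to step along the branch, the bordered system stays nonsingular at the fold and the computation passes through it onto the unstable branch.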


The  original  program  itself  had  these  scientific  computing  capabilities,
but  in  a  batch  processing  environment.  Our  contribution  was  to  make
it  interactive  in  real  time,  with  real-time  data  visualization,  and  we  were  also
able  to  “tailor”  the  FORTRAN  or  C  output  of  our  neural  network  training
algorithms  to  the  subroutines  necessary  for  linking  into  this  program.  This
procedure  is  not  completely  automated  (some  hand-editing  has  to  be  done
before  compilation),  but  could  in  principle  easily  become  so,  especially  by
running  the  training  output  through  a  symbolic  manipulator  like  Mathematica.

It  is  worth  adding  that  we  are  currently  “experimentally”  working
on  modifications  of  these  packages  for  noninvertible  systems,  for  2-D  PDEs,
for  getting  the  two  packages  to  communicate  completely,  etc.  It  is  also  worth
adding  that  these  packages  can  be  used  with,  in  principle,  any  neural  network
model;  they  study  the  dynamics  of  the  model  one  puts  in;  it  is  not  necessary  that
this  model  “come”  from  one  of  our  own  neural  network  training  algorithms.

Finally,  we  should  also  mention  that  in  the  process  of  this  work  we
had  to  develop  (with  the  assistance  of  the  Interactive  Computer  Graphics
Laboratory,  in  Princeton)  a  number  of  programs  for  the  digital  access,  processing
and  real-time  visualization  of  video  data,  using  optical  disks  and  SGI
computers.  At  the  time  that  we  were  performing  the  research,  no  standards
for  this  type  of  work  were  available;  this  seems  to  be  changing  rapidly.

1.1.4  Other  Developments 

Adaptive  control 

One  of  the  main  motivations  of  our  research  on  nonlinear  signal  processing 
and  model  identification  was  the  eventual  use  of  these  models  for  control 
purposes.  Our  original  connection  with  the  Control  group  at  Shell  Research 
and  Engineering  was  Dr.  Melinda  Golden,  who  was  working  on  adaptive 
control  problems.  We  started  research  on  Model  Reference  Adaptive  Control 
both  theoretically  and  computationally  before  this  proposal,  and  then  as  part
of  this  proposal  we  studied  the  effect  of  plant/model  mismatch  in  adaptive
control.  This  resulted  in  one  publication  co-authored  with  Dr.  Golden  (Proceedings
of  a  NATO  summer  school  in  1992)  as  well  as  the  discovery  and  subsequent
analysis  of  a  new  secondary  instability  in  an  adaptive  control  system
(SIAM  J.  Math.  Anal.  (1995)).  We  have  also  done  experimental  work  on  the
adaptive  control  of  a  mixing  tank,  which  has  been  published  in  the  Proceedings
of  the  1992  ACC.  Shell  R&E  initially  supported  this  research  financially,
and  later  they  donated  to  the  Princeton  group  a  two-tank  adaptive  control 
experiment,  which  we  had  previously  analyzed  theoretically,  and  which  was 
studied  in  the  Senior  Thesis  of  an  undergraduate  student  in  Princeton.  As
part  of  this  work  we  also  count  our  comparison  of  a  priori  theoretical  and
semi-empirical  methods  for  model  reduction  in  distributed  nonlinear  systems
(a  study  of  the  relation  between  Approximate  Inertial  Manifold  techniques
and  the  POD,  or  Karhunen-Loève  expansion,  with  Dr.  Michael  Graham,
which  is  currently  in  press  in  Computers  and  Chemical  Engineering  (1995)).
We  are  still  very  much  interested  in  the  use  of  our  ANN-identified  models  in
model-based  control,  and  hope  to  continue  research  along  these  lines.
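For readers unfamiliar with the POD (Karhunen-Loève expansion) mentioned above, the following minimal sketch extracts the leading empirical mode from a set of snapshots by power iteration on the time-averaged correlation operator. The synthetic two-mode data, the grid sizes and the function names are assumptions made purely for illustration; they are not data or code from the study cited.

```python
import math

def pod_leading_mode(snapshots, iters=100):
    """Leading POD mode and its eigenvalue for a list of equal-length snapshots,
    computed by power iteration on C = (1/m) * sum_k u_k u_k^T."""
    m, n = len(snapshots), len(snapshots[0])
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [0.0] * n
        for u in snapshots:
            # apply C without forming it: accumulate u_k * (u_k . v) / m
            coeff = sum(ui * vi for ui, vi in zip(u, v)) / m
            for i in range(n):
                w[i] += coeff * u[i]
        lam = math.sqrt(sum(wi * wi for wi in w))  # -> leading eigenvalue of C
        v = [wi / lam for wi in w]
    return v, lam

# Synthetic field: a dominant sin(x) pattern plus a weak sin(2x) pattern.
n, m = 32, 64
xs = [math.pi * (i + 1) / (n + 1) for i in range(n)]
ts = [2.0 * math.pi * k / m for k in range(m)]
snapshots = [
    [math.cos(t) * math.sin(x) + 0.2 * math.sin(2.0 * t) * math.sin(2.0 * x)
     for x in xs]
    for t in ts
]

mode, eigenvalue = pod_leading_mode(snapshots)
total_energy = sum(sum(ui * ui for ui in u) for u in snapshots) / m  # trace of C
energy_fraction = eigenvalue / total_energy
```

The energy fraction captured by the leading modes is the usual empirical criterion for deciding how many modes to retain in a reduced model.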

Industrial  Time  Series 

During  the  first  year  of  this  work  a  graduate  student  from  Princeton  partially
supported  through  this  grant  (Dr.  Christos  Frouzakis)  visited  Shell  R&E  for
the  summer  at  Westhollow,  Houston,  Texas,  and  worked  with  Dr.  Garcia’s
group  on  the  construction  of  ANN-based  models  of  a  coal  gasification  process.
The  time  series  were  obtained  by  Shell  (from  a  coal  gasification  plant;  the
plant  was  under  control  -“controllers  not  tightly  tuned”-  and  inputs  were
generated  during  pseudo-random  binary  noise  -PRBN-  testing)  and  were
“shifted”  in  an  unknown  (to  us)  way  before  we  were  allowed  to  work  with
them.  The  work  led  to  a  “second  proposition”  by  Dr.  Frouzakis,  entitled
“Neural  Network  Identification  of  Real  Plant  Data”,  and  was  of  practical
interest  to  Shell.  Unfortunately,  as  part  of  restructuring  at  Shell,  that  Control
group  was  radically  changed,  and  though  we  remain  in  contact
with  them  (as  evidenced  by  their  donation  to  us  of  their  adaptive  control
experiment)  their  interest  has  shifted  away  from  nonlinear  system  identification.
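The actual test signal used at the plant is not known to us from this report. Purely as an illustration of what a pseudo-random binary input looks like, the sketch below generates one with a 7-bit linear feedback shift register (the standard PRBS7 polynomial x^7 + x^6 + 1, period 127). The register length, the seed, the input levels and the function names are all assumptions.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a PRBS7 sequence (x^7 + x^6 + 1, period 127)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be nonzero in its low 7 bits")
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # feedback taps 7 and 6
        state = ((state << 1) | new_bit) & 0x7F      # shift and keep 7 bits
        bits.append(new_bit)
    return bits

def to_input_levels(bits, low=-1.0, high=1.0):
    """Map bits to two input levels, e.g. deviations around a set point."""
    return [high if b else low for b in bits]

sequence = prbs7(254)              # two full periods
levels = to_input_levels(sequence)
```

Such a signal switches between two levels at pseudo-random instants, which excites the process over a broad range of frequencies while keeping the input amplitude bounded.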

We  have  also  discussed  with  Exxon  (in  New  Jersey)  the  possibility  of
obtaining  data  from  trickle  and  fluidized  beds.  Exxon  did  partially  support
our  work  through  an  Exxon  Education  Foundation  grant  ($10,000  a  year  for
the  last  three  years),  and  we  are  currently  discussing  with  them  the  donation
of  a  magnetic  fluidized  bed,  which  would  allow  us  to  look  at  pattern  formation
in  such  gas-particle  flow  industrial  reactors.  We  did,  however,  in  preparation
for  this  possibility,  perform  extensive  research  on  modeling  and  pattern
(bubble)  formation  in  fluidized  beds,  and  a  long  and  systematic  paper  on
this  was  just  submitted  to  the  J.  Fluid  Mech.  in  February  1995.

Composite  Catalysts 

One  of  the  “unforeseen”  developments  of  this  research,  as  we  discussed  above, 
was  the  conception,  construction,  study  and  modeling  of  microstructured 
catalysts,  on  which  spatiotemporal  patterns  due  to  catalytic  reactions  can  be 
observed.  While  at  the  beginning  we  used  micropatterning  techniques  (such 
as  lithography)  to  “build  inert  fences”  on  catalytic  surfaces,  we  now  have  gone 
on  to  the  construction  of  composite  catalysts:  for  example,  we  can  -and  do-
construct  checkerboard  patterns  of  one  catalyst  (say  Pd)  on  a  single
crystal  of  another  catalyst  (say  Pt)  at  a  scale  of  a  few  microns.  This  new  class
of  composite  catalytic  materials  has  the  potential  of  drastically  changing
the  overall  reactivity  or  selectivity  of  the  catalysts  for  some  reactions.  For
example,  a  small  circle,  a  couple  of  monolayers  thick,  of  one  catalyst  on 
another  may  act  as  a  “pacemaker”  for  the  extended  surface  by  facilitating 
the  adsorption  of  a  key  species,  which  is  then  transported  through  surface 
diffusion  to  the  rest  of  the  surface. 

We  are  of  course  fortunate  that  the  scales  of  interest  (microns)  are  at  the
same  time  physically  relevant  (comparable  to  surface  diffusion  length  scales),
accessible  to  the  available  microscopies,  and  within  reach  of  available
microelectronics  fabrication  techniques,  which  can  build  such  features  without
great  difficulty.

We  are  very  actively  pursuing  this  work,  and  we  hope  that  it  may  lead 
to  important  technological  developments  in  catalyst  design  while  at  the  same 
time  providing  a  wealth  of  spatiotemporal  patterns  to  analyze  and  study. 

1.2  PUBLICATIONS 